Using randomization to create (nearly) identical groups
GVPT399F: Power, Politics, and Data
Our goal
Next best thing
Instead, we should try to make two groups that are as similar as possible to each other prior to treatment.
The magic of randomization
Perfectly random assignment does this very well!
Don’t take my word for it
Imagine we have a group of 1,000 individuals. We know the following about them:
Height
Weight
Eye colour
Our group
# A tibble: 1,000 × 4
id height weight eye_colour
<int> <dbl> <dbl> <chr>
1 1 175. 55.6 Blue
2 2 176. 86.6 Green
3 3 161. 56.1 Green
4 4 171. 106. Blue
5 5 177. 86.1 Green
6 6 173. 88.8 Green
7 7 166. 82.0 Grey
8 8 179. 70.7 Brown
9 9 168. 91.8 Blue
10 10 167. 76.6 Brown
# ℹ 990 more rows
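A group like this can be simulated in a few lines of R. This is a minimal sketch, not the code behind the output above. The name `group_df`, the seed, the normal distributions for height and weight, and the four equally likely eye colours are all illustrative assumptions.

```r
library(tibble)

set.seed(1234)  # hypothetical seed, only so the sketch is reproducible

# Assumed distributions: none of these parameters come from the slides
group_df <- tibble(
  id = 1:1000,
  height = rnorm(1000, mean = 170, sd = 6),    # centimetres (assumed)
  weight = rnorm(1000, mean = 80, sd = 15),    # kilograms (assumed)
  eye_colour = sample(
    c("Blue", "Brown", "Green", "Grey"),       # assumed to be equally likely
    size = 1000,
    replace = TRUE
  )
)

group_df
```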
Random assignment
I’m now going to flip an (imaginary, computer-generated) coin for each of these 1,000 individuals to assign them to group A or B:
# A tibble: 1,000 × 5
id height weight eye_colour group
<int> <dbl> <dbl> <chr> <fct>
1 1 175. 73.4 Blue B
2 2 170. 80.2 Blue A
3 3 176. 65.3 Blue B
4 4 175. 65.4 Green B
5 5 166. 55.2 Brown A
6 6 162. 93.1 Green B
7 7 169. 80.0 Green A
8 8 171. 68.5 Brown B
9 9 156. 73.6 Brown A
10 10 174. 64.6 Green A
# ℹ 990 more rows
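The coin flip can be sketched with `sample()`. This is one plausible way to do it, under the assumption that each person gets an independent 50/50 draw; the stand-in data and the seed are hypothetical, and only the name `rand_group` is taken from the plotting code on this slide.

```r
library(tibble)
library(dplyr)

set.seed(1234)  # hypothetical seed

# Stand-in for the 1,000 individuals (assumed distributions)
group_df <- tibble(
  id = 1:1000,
  height = rnorm(1000, mean = 170, sd = 6),
  weight = rnorm(1000, mean = 80, sd = 15),
  eye_colour = sample(c("Blue", "Brown", "Green", "Grey"), 1000, replace = TRUE)
)

# One fair coin flip per person: A or B, each with probability 0.5
rand_group <- group_df |>
  mutate(group = factor(sample(c("A", "B"), size = n(), replace = TRUE)))

# How many people landed in each group?
count(rand_group, group)
```

Because every flip is independent, the two groups will usually be close to, but not exactly, 500 people each.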
rand_group |>
  count(group, eye_colour) |>
  ggplot(aes(x = n, y = reorder(eye_colour, n), fill = group)) +
  geom_bar(position = "dodge", stat = "identity") +
  labs(x = "Count", y = "Eye color", fill = "Group")
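The same check can be run on the numeric traits by comparing group averages. The sketch below rebuilds a stand-in `rand_group` from assumed distributions, so its numbers will not match the slides; `balance` is a hypothetical name.

```r
library(tibble)
library(dplyr)

set.seed(1234)  # hypothetical seed

# Stand-in data with a coin-flip group assignment (assumed distributions)
rand_group <- tibble(
  id = 1:1000,
  height = rnorm(1000, mean = 170, sd = 6),
  weight = rnorm(1000, mean = 80, sd = 15),
  group = factor(sample(c("A", "B"), 1000, replace = TRUE))
)

# Average height and weight within each randomly assigned group
balance <- rand_group |>
  group_by(group) |>
  summarise(
    people = n(),
    mean_height = mean(height),
    mean_weight = mean(weight)
  )

balance
```

If randomization is doing its job, the two rows should show very similar averages.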
Making sure this wasn’t a fluke
Let’s re-run this:
# A tibble: 1,000 × 5
id height weight eye_colour group
<int> <dbl> <dbl> <chr> <fct>
1 1 159. 72.0 Blue A
2 2 172. 100. Brown B
3 3 170. 88.6 Brown A
4 4 176. 88.7 Blue B
5 5 168. 86.8 Green A
6 6 183. 80.6 Brown B
7 7 163. 79.5 Brown A
8 8 174. 64.3 Green B
9 9 171. 80.2 Green A
10 10 169. 84.3 Blue A
# ℹ 990 more rows
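Rather than re-running by hand, the assignment can be repeated many times to see how far apart the groups ever get. This is a sketch under assumed values: the height distribution, the seed, and the name `diff_in_means` are illustrative, and here only the coin flips are redrawn while the heights stay fixed.

```r
set.seed(1234)  # hypothetical seed

# Fixed stand-in heights for 1,000 people, in centimetres (assumed distribution)
height <- rnorm(1000, mean = 170, sd = 6)

# Redo the coin-flip assignment 1,000 times, recording the gap in mean height
diff_in_means <- replicate(1000, {
  group <- sample(c("A", "B"), size = length(height), replace = TRUE)
  mean(height[group == "A"]) - mean(height[group == "B"])
})

# Gaps should cluster tightly around zero
summary(diff_in_means)
```

A histogram of `diff_in_means` (for example with `hist(diff_in_means)`) shows the same thing visually.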